File uploads

The File Uploads tab is shown only if the task is defined with a target endpoint that supports this feature.

Click the Optimize File Uploads button to improve performance when replicating to file-based targets such as Amazon S3 and Hadoop. When this feature is enabled, the button label changes to Disable File Upload Optimization. Click that button to turn the optimization off.

The upload mode depends on the task type:

- Full Load: Multiple files created from the same table are transferred in parallel, in no particular order.
- Apply Changes: Files created from multiple tables are transferred in parallel. Files created from the same table are transferred sequentially according to creation time.
- Change Data Partitioning: Files created from multiple tables and files created from the same table are transferred in parallel.
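
The ordering rules above can be illustrated with a small model. This is only a sketch for reasoning about transfer order, not the product's implementation. The `PendingFile` type, the mode strings, and the `upload_waves` function are hypothetical names introduced for this example. Files in the same "wave" may be transferred in parallel. The wave model is slightly stricter than the rule it illustrates, since in practice each table's queue could advance independently.

```python
from collections import defaultdict
from typing import NamedTuple


class PendingFile(NamedTuple):
    """Hypothetical description of a file waiting to be uploaded."""
    table: str
    name: str
    created: int  # creation time, in any increasing unit


def upload_waves(files, mode):
    """Group file names into waves; files within a wave may go out in parallel."""
    files = list(files)
    if mode in ("full_load", "change_data_partitioning"):
        # No ordering constraint: all files may be transferred at once.
        return [sorted(f.name for f in files)] if files else []
    if mode == "apply_changes":
        # Same-table files are sequential by creation time;
        # files from different tables may overlap.
        per_table = defaultdict(list)
        for f in files:
            per_table[f.table].append(f)
        for queue in per_table.values():
            queue.sort(key=lambda f: f.created)
        depth = max((len(q) for q in per_table.values()), default=0)
        return [
            sorted(q[i].name for q in per_table.values() if len(q) > i)
            for i in range(depth)
        ]
    raise ValueError(f"unknown mode: {mode}")


pending = [
    PendingFile("orders", "orders_2", created=2),
    PendingFile("orders", "orders_1", created=1),
    PendingFile("customers", "customers_1", created=5),
]
for wave in upload_waves(pending, "apply_changes"):
    print(wave)
```

In Apply Changes mode the two `orders` files never share a wave, while `customers_1` can travel alongside `orders_1`.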

If you disable this option after the task has started, you must do one of the following:

- If the task is in the Full Load stage, reload the target using the Reload Target Run option.
- If the task is in the Change Processing stage, resume the task using the Start processing changes from Run option.

Limitations and considerations

Supported target endpoints

File upload optimization is supported by the following target endpoints only:
- Amazon S3
- Hadoop (Hortonworks and Cloudera)
- Microsoft Azure ADLS
- Databricks (Cloud Storage)
- Microsoft Azure HDInsight
- Hortonworks Data Platform (HDP)
- Google Cloud Storage
- Google Cloud Dataproc
- Amazon EMR
- Cloudera Data Platform (CDP) Private Cloud

General limitations and considerations

- Post Upload Processing endpoint settings are not supported.
- Optimize File Uploads cannot be enabled together with Speed partition mode.

Hadoop limitations and considerations

- When replicating to a Hadoop target, only Text and Sequence file formats are supported.
- Hive jobs are not supported, as they prevent the files from being uploaded.
- Append is not supported when using Text file format.
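
The general and Hadoop-specific limitations above can be expressed as a simple pre-flight check. The sketch below is illustrative only. The settings dictionary and all of its keys (`target`, `file_format`, `append`, `hive_jobs`, `partition_mode`, `post_upload_processing`) are hypothetical names chosen for this example and do not correspond to actual product setting names.

```python
def upload_optimization_problems(settings):
    """Return the documented limitations that the given settings would violate.

    `settings` is a plain dict with hypothetical keys; see the lead-in above.
    """
    problems = []

    # General limitations.
    if settings.get("post_upload_processing"):
        problems.append("Post Upload Processing endpoint settings are not supported")
    if settings.get("partition_mode") == "speed":
        problems.append("cannot be enabled together with Speed partition mode")

    # Hadoop-specific limitations.
    if settings.get("target") == "hadoop":
        file_format = settings.get("file_format")
        if file_format not in ("text", "sequence"):
            problems.append("Hadoop targets support only Text and Sequence file formats")
        if settings.get("hive_jobs"):
            problems.append("Hive jobs are not supported")
        if file_format == "text" and settings.get("append"):
            problems.append("Append is not supported with Text file format")

    return problems


candidate = {"target": "hadoop", "file_format": "text", "append": True}
for problem in upload_optimization_problems(candidate):
    print("Conflict:", problem)
```

An empty list means that none of the limitations listed here apply to the given combination.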

Amazon S3 and Microsoft Azure ADLS limitations and considerations

- When working with Reference Files, a new entry is added to the Reference File immediately after the data file is uploaded (even if the DFM file has not been uploaded yet).
- The existence of the DFM file does not necessarily mean that the associated data file has also been uploaded.
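
Because neither file guarantees the presence of the other, a downstream consumer may want to confirm that both exist before processing a pair. The following is a minimal sketch of such a check. It assumes that a data file and its DFM file share the same base name and differ only in extension (`.csv` and `.dfm` here), and that the consumer already has a listing of object keys. The function name and both suffix parameters are hypothetical.

```python
from pathlib import PurePosixPath


def complete_pairs(keys, data_suffix=".csv", dfm_suffix=".dfm"):
    """Return (data_key, dfm_key) tuples for which both files are present.

    Assumes the data file and its DFM file share a base name (an assumption
    made for this example).
    """
    present = set(keys)
    ready = []
    for key in sorted(present):
        path = PurePosixPath(key)
        if path.suffix != dfm_suffix:
            continue
        data_key = str(path.with_suffix(data_suffix))
        if data_key in present:
            ready.append((data_key, key))
    return ready


listing = ["sales/a.csv", "sales/a.dfm", "sales/b.dfm", "sales/c.csv"]
for data_key, dfm_key in complete_pairs(listing):
    print("ready:", data_key, dfm_key)
```

Here only the `sales/a` pair is reported, since `sales/b` has no data file yet and `sales/c` has no DFM file yet.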
